Exploratory Data Analysis on Raw Drilling Data

Task to do

This notebook guides you through extracting drilling data from the CNLOPB website.

The overall goal is to extract drilling data from the CNLOPB repository and perform exploratory data analysis on it.

Data Source

CNLOPB - Canada-Newfoundland & Labrador Offshore Petroleum Board (https://www.cnlopb.ca/)

In this notebook, we will explore drilling data from the CNLOPB website.

Go to the CNLOPB website and search for well Hibernia B-16 38. You can also find the data at the link below.

https://home-cnlopb.hub.arcgis.com/pages/well-inventory

Check Inventory & LAS section for this well (Hibernia B-16 38)

Alternatively, you can find the inventory and LAS file location at the link below

https://home-cnlopb.hub.arcgis.com/pages/hibernia-b-16-38

Now we will investigate the contents of the 311 mm section

Download the file LAS-011361 for Hibernia B-16 38 - FDP 311 mm Section, as shown in the image below:

image-2.png

Alternatively, you can download the LAS data file from the link below: https://cnlopb.maps.arcgis.com/sharing/rest/content/items/9b346dbd45f74aaa836dbfed4c8b3aab/data

Try to open this file using the line of code below.

We get the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 3039: invalid start byte
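The failing byte hints at the cause. As a small illustration (not the notebook's original cell), the sketch below reproduces the same error on a tiny byte string. Byte 0xb0 is the degree sign in Latin-1 and is not a valid start of a UTF-8 sequence. That the downloaded file is Latin-1 encoded is an assumption, and the header fragment is made up.

```python
# Reproduce the error on a tiny byte string instead of the real file.
raw = b"DEG \xb0C"  # hypothetical header fragment containing byte 0xb0

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as err:
    utf8_ok = False
    print(f"UTF-8 decoding failed: {err.reason}")

# Latin-1 maps every single byte to a character, so decoding cannot fail.
text = raw.decode("latin-1")
print(text)
```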

These files were originally created in LAS format, so we will rename the file extension from '.csv' to '.las' and load the file again.

To read a LAS file in Python, we need the lasio library. If you do not have lasio installed, please install it.
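A minimal sketch of these two steps follows. The helper names `rename_to_las` and `load_las_as_dataframe` are hypothetical, and the file name in the usage note is only an example. `lasio.read` and `LASFile.df` are the lasio calls used to parse the file and expose its curves as a pandas DataFrame.

```python
from pathlib import Path


def rename_to_las(csv_path):
    """Rename a mislabelled .csv file to .las and return the new path."""
    src = Path(csv_path)
    dst = src.with_suffix(".las")
    src.rename(dst)
    return dst


def load_las_as_dataframe(las_path):
    """Read a LAS file with lasio and return its curves as a DataFrame."""
    import lasio  # install first with: pip install lasio

    las = lasio.read(str(las_path))
    return las.df()  # indexed by the first curve in the file
```

With the downloaded file saved as, say, 'LAS-011361.csv', the two helpers could be chained as `load_las_as_dataframe(rename_to_las("LAS-011361.csv"))`.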

Profile Report

A profile report generates a comprehensive report for all the data.

Run the next cell to generate a report, then explore its different features. Generating the report may take a few minutes, depending on the amount of data.

Save the profile report as an HTML file
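One way to build and save such a report is sketched below. It assumes the ydata-profiling package (the successor of pandas-profiling); the package this notebook originally used is not stated. The function name, report title, and output file name are hypothetical.

```python
def save_profile_report(df, html_path="profile_report.html"):
    """Build a profiling report for df and write it to an HTML file."""
    from ydata_profiling import ProfileReport  # pip install ydata-profiling

    # minimal=True skips the most expensive computations on large datasets.
    report = ProfileReport(df, title="Drilling Data Profile", minimal=True)
    report.to_file(html_path)
    return html_path
```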

Data Type

Change TIME from object to datetime format
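A small sketch of the conversion with `pd.to_datetime` is shown here on made-up values; the timestamp format in the real file may differ.

```python
import pandas as pd

# Hypothetical sample; the last entry is deliberately unparseable.
sample = pd.DataFrame(
    {"TIME": ["2015-03-01 00:00:05", "2015-03-01 00:00:10", "bad value"]}
)

# errors="coerce" turns unparseable strings into NaT instead of raising.
sample["TIME"] = pd.to_datetime(sample["TIME"], errors="coerce")
print(sample.dtypes)
```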

Let's have a look at the missing data
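As a sketch, missing values can be summarized per column with `isna()`. The values below are made up; only the column names ROP5, SWOB, and RPM come from this notebook.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "ROP5": [12.0, np.nan, 15.5, np.nan],
    "SWOB": [80.0, 82.0, np.nan, 85.0],
    "RPM": [120.0, 121.0, 119.0, 120.0],
})

# Count and percentage of missing values per column, worst first.
missing = pd.DataFrame({
    "missing_count": sample.isna().sum(),
    "missing_pct": sample.isna().mean() * 100,
}).sort_values("missing_count", ascending=False)
print(missing)
```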

The first index column shows the epoch time, counted from 1900. We will ignore this index since we have a separate TIME stamp within the dataframe.

For the rest of the task, we will choose the following data points for our analysis

Draw time vs depth profile

This will show complete drilling activities for this section (311 mm section)

The above figure shows an inverted view of the drilling activity over time, with depth increasing upward. We will use 'plt.gca().invert_yaxis()' to invert the y-axis so that depth increases downward.
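A minimal sketch of a time vs depth plot with the inverted axis is given below. The column name DEPTH and all values are hypothetical. `ax.invert_yaxis()` is the object-oriented equivalent of `plt.gca().invert_yaxis()`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

sample = pd.DataFrame({
    "TIME": pd.date_range("2015-03-01 00:00", "2015-03-01 05:00", periods=6),
    "DEPTH": [1000, 1010, 1010, 1005, 1020, 1035],  # hypothetical depths in m
})

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(sample["TIME"], sample["DEPTH"])
ax.set_xlabel("Time")
ax.set_ylabel("Depth (m)")
ax.invert_yaxis()  # deeper values are now drawn lower on the plot
```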

The above figure contains both drilling and non-drilling activities over time. Therefore, we will separate the drilling data from the non-drilling activity.

Separate drilling and non-drilling data

In this dataset, we will consider an operation as drilling when Rate of Penetration (ROP5), Surface Weight on Bit (SWOB), and Rotary Speed (RPM) all have positive values.

Also, Composite On Bottom Status (COBTM) = 0 represents a drilling state.

We will store the drilling data as df
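The rule above can be sketched as a boolean mask. The rows are made up; the column names follow the mnemonics used in this notebook.

```python
import pandas as pd

sample = pd.DataFrame({
    "ROP5": [0.0, 12.0, 15.0, 9.0],
    "SWOB": [0.0, 80.0, 0.0, 75.0],
    "RPM": [0.0, 120.0, 118.0, 119.0],
})

# A row counts as drilling only when all three channels are positive.
is_drilling = (sample["ROP5"] > 0) & (sample["SWOB"] > 0) & (sample["RPM"] > 0)

df = sample[is_drilling].copy()             # drilling rows
non_drilling = sample[~is_drilling].copy()  # everything else
print(len(df), len(non_drilling))
```

Taking a copy of each subset avoids pandas warnings when new columns are added to df later.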

Plot active drilling data

Separate data based on Composite On Bottom Status

Alternatively, we can separate drilling and non-drilling data by Composite On Bottom Status (COBTM).

COBTM = 0 represents a drilling status

COBTM = 1 represents a non-drilling status
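A sketch of this alternative split follows, using made-up rows. The DEPTH column name and the STATUS label column are hypothetical additions for illustration.

```python
import pandas as pd

sample = pd.DataFrame({
    "DEPTH": [1000.0, 1000.0, 1001.5, 1003.0],  # hypothetical column name
    "COBTM": [1, 0, 0, 1],
})

# Attach a readable label, then split on the status flag.
status_labels = {0: "drilling", 1: "non-drilling"}
sample["STATUS"] = sample["COBTM"].map(status_labels)

on_bottom = sample[sample["COBTM"] == 0]
off_bottom = sample[sample["COBTM"] == 1]
print(sample["STATUS"].value_counts())
```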

Adding Geo Tag to the Surface Data

In this step, we will add geological tags to the existing data.

Geological marker data are available in the Hibernia B-16 38 End of Well Report.

The End of Well Report is available as 'Hibernia B-16 38 - End of Well Report' with file ID INV-124388.

image.png

Download the file from this link: https://home-cnlopb.hub.arcgis.com/pages/hibernia-b-16-38

Check PDF page number 405 onwards
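Once the formation tops are read from the report, one way to tag each row is `pd.cut` on depth, as sketched below. The formation names, top depths, and the DEPTH column name are placeholders, not values from the End of Well Report.

```python
import numpy as np
import pandas as pd

# Hypothetical formation tops (m); take the real values from the report.
tops = pd.DataFrame({
    "FORMATION": ["Formation A", "Formation B", "Formation C"],
    "TOP": [1000.0, 1500.0, 2200.0],
})

sample = pd.DataFrame({"DEPTH": [1020.0, 1499.9, 1500.0, 2500.0]})

# Each interval runs from one top (inclusive) to the next top (exclusive).
edges = list(tops["TOP"]) + [np.inf]
sample["FORMATION"] = pd.cut(
    sample["DEPTH"], bins=edges, labels=list(tops["FORMATION"]), right=False
)
print(sample)
```

Rows shallower than the first top fall outside every interval and get a missing tag.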

Statistical Distribution

Statistical Distribution of Data

In this step, we will use boxplots and violin plots to see the data distribution. These plots also help us see how the outliers are distributed.
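To make the notion of an outlier concrete, the sketch below computes the 1.5 x IQR fences that a default matplotlib boxplot uses for its whiskers. The SWOB values are made up.

```python
import pandas as pd

swob = pd.Series([78, 80, 81, 82, 83, 85, 140], name="SWOB")  # hypothetical values

q1, q3 = swob.quantile(0.25), swob.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Points outside the fences are the ones a boxplot draws as individual markers.
outliers = swob[(swob < lower) | (swob > upper)]
print(outliers.tolist())
```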

Histogram

KDE plot

A kernel density estimate (KDE) plot is a method for visualizing the distribution of observations in a dataset
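To show what a KDE computes, here is a minimal NumPy sketch of a one-dimensional Gaussian KDE. It is an illustration only, not what plotting libraries run internally; the function name, the fixed bandwidth, and the synthetic RPM values are assumptions.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of samples on grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # One row per grid point, one column per sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * bandwidth)


rng = np.random.default_rng(0)
rpm = rng.normal(loc=120.0, scale=5.0, size=500)  # synthetic RPM values
grid = np.linspace(90.0, 150.0, 601)
density = gaussian_kde_1d(rpm, grid, bandwidth=1.5)

# A density should integrate to roughly 1 over a grid that covers the data.
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A smaller bandwidth follows the data more closely but looks noisier; a larger one gives a smoother curve that may hide real features.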

Boxplot

Violin Plot

Check multiple categorical data together

In this step, we will see the SWOB violin plot distribution by formation
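A sketch of a per-formation violin plot using plain matplotlib is given below; the notebook's own plotting library for this step is not shown, so this is a substitute. The formation tags and SWOB values are synthetic.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
sample = pd.DataFrame({
    "FORMATION": ["Formation A"] * 50 + ["Formation B"] * 50,  # hypothetical tags
    "SWOB": np.concatenate([rng.normal(80, 5, 50), rng.normal(95, 8, 50)]),
})

# One array of SWOB values per formation, in sorted formation order.
groups = {name: g["SWOB"].to_numpy() for name, g in sample.groupby("FORMATION")}

fig, ax = plt.subplots()
parts = ax.violinplot(list(groups.values()), showmedians=True)
ax.set_xticks(range(1, len(groups) + 1))  # violins sit at positions 1..N
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel("SWOB")
```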

Scatter Plot

Interactive Plot

We can draw interactive plots using plotly.

At the bottom of this plot, you can see the legend entries.

You can click on the legend to view the distribution of a specific data point or series of data.

You can also hover over any data point to see its properties.

The above figure shows a few extreme data points for SMSE on the y-axis. We can apply a further filter to see a more detailed view.
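One simple filter is a percentile cutoff, sketched below on made-up values. The 95th percentile threshold is an assumption for illustration and should be tuned to the dataset.

```python
import pandas as pd

sample = pd.DataFrame({
    "ROP5": [10.0, 12.0, 11.0, 13.0, 0.1],
    "SMSE": [150.0, 140.0, 160.0, 155.0, 90000.0],  # hypothetical values
})

cutoff = sample["SMSE"].quantile(0.95)  # assumed threshold; tune per dataset
filtered = sample[sample["SMSE"] <= cutoff]
print(len(sample), "->", len(filtered))
```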

Heatmap

We will further reduce our dataframe and see the correlation between the data points

Let us create a separate dataframe, df2, with a reduced dataset.

We will use the df2.corr() function to get the correlation between the data.

Finally, we will use a heatmap to visualize the correlation
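These steps can be sketched as follows, with synthetic data standing in for the reduced dataframe. The heatmap here is drawn with matplotlib's `imshow`; the notebook's own plotting call is not shown, so this is one possible choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
swob = rng.normal(80, 5, 200)
df2 = pd.DataFrame({
    "SWOB": swob,
    "ROP5": 0.2 * swob + rng.normal(0, 1, 200),  # synthetic, correlated with SWOB
    "RPM": rng.normal(120, 3, 200),
})

corr = df2.corr()  # Pearson correlation by default

fig, ax = plt.subplots()
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(image, ax=ax, label="Correlation")
```

Fixing the color range to [-1, 1] keeps zero correlation at the neutral color, so positive and negative values are easy to tell apart.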

Averaging High Density Data

Raw field data are often noisy, and sometimes we need to smooth them with signal processing techniques. Let us look at SWOB data against depth.

Now we will use the rolling mean function to smooth the data. We set a rolling window of 50, which means SWOB data will be averaged over 50 consecutive data points.
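A minimal sketch of the rolling mean on a synthetic noisy signal is shown below. The DEPTH column name and the new column SWOB_SMOOTH are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
sample = pd.DataFrame({
    "DEPTH": np.linspace(1000.0, 1100.0, 500),  # hypothetical column name
    "SWOB": 80.0 + rng.normal(0.0, 10.0, 500),  # synthetic noisy signal
})

# The first 49 rows have no full window of 50 points, so their result is NaN.
sample["SWOB_SMOOTH"] = sample["SWOB"].rolling(window=50).mean()
print(sample[["SWOB", "SWOB_SMOOTH"]].std())
```

Passing `min_periods=1` to `rolling` would fill the leading rows with partial-window averages instead of NaN, at the cost of noisier values at the start.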

We can plot both raw data and processed data together

Subplot

Adding Formation Markers to the Plot
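One way to mark formation tops on a depth plot is a horizontal line per top, sketched below. The formation names, top depths, and the SWOB curve are placeholders rather than values from the End of Well Report.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

depth = np.linspace(1000.0, 2500.0, 100)
swob = 80.0 + 5.0 * np.sin(depth / 100.0)  # synthetic curve

# Hypothetical formation tops (m); replace with values from the report.
tops = {"Formation B": 1500.0, "Formation C": 2200.0}

fig, ax = plt.subplots(figsize=(4, 8))
ax.plot(swob, depth)
for name, top in tops.items():
    ax.axhline(top, color="red", linestyle="--", linewidth=1)
    ax.text(swob.max(), top, name, va="bottom", ha="right")
ax.set_xlabel("SWOB")
ax.set_ylabel("Depth (m)")
ax.invert_yaxis()  # depth increases downward
```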

Save data to a CSV file

Load the data file again from the same location
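A sketch of the save and reload round trip follows. The output file name and the sample rows are made up; the file is written to the system temporary directory only to keep the example self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd

sample = pd.DataFrame({
    "TIME": pd.to_datetime(["2015-03-01 00:00:05", "2015-03-01 00:00:10"]),
    "SWOB": [80.0, 82.5],
})

out_path = Path(tempfile.gettempdir()) / "drilling_data_sample.csv"  # hypothetical name
sample.to_csv(out_path, index=False)  # index=False avoids an extra unnamed column

# parse_dates restores TIME as datetime; otherwise it is read back as text.
reloaded = pd.read_csv(out_path, parse_dates=["TIME"])
print(reloaded.dtypes)
```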

Thank you

Email your questions, comments, or suggestions: mmh710@mun.ca

https://github.com/mojammelhuque/Drilling-Data-Analytics